StatAlign 2.0: combining statistical alignment with RNA secondary structure prediction

Authors

  • Preeti Arunapuram
  • Ingolfur Edvardsson
  • Michael Golden
  • James W. J. Anderson
  • Ádám Novák
  • Zsuzsanna Sükösd
  • Jotun Hein
Abstract

MOTIVATION Comparative modeling of RNA is known to be important for making accurate secondary structure predictions. RNA structure prediction tools such as PPfold or RNAalifold take an aligned set of sequences as input. Obtaining a multiple alignment from a set of sequences is itself a challenging problem, and the quality of the alignment can affect the quality of the prediction. Implementing RNA secondary structure prediction in a statistical alignment framework, and predicting structures from multiple alignment samples instead of a single fixed alignment, may improve predictions.

RESULTS We have extended the program StatAlign with RNA-specific features, including RNA secondary structure prediction from multiple alignments using either a thermodynamic approach (RNAalifold) or a stochastic context-free grammar (SCFG) approach (PPfold). We also provide the user with scores relating to the quality of a secondary structure prediction, such as information entropy values for the combined space of secondary structures and sampled alignments, and a reliability score that predicts the expected number of correctly predicted base pairs. Finally, we have created RNA secondary structure visualization plugins and automated the process of setting up Markov chain Monte Carlo runs for RNA alignments in StatAlign.

AVAILABILITY AND IMPLEMENTATION The software is available from http://statalign.github.com/statalign/.
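The idea of predicting from many alignment samples and scoring the result with an entropy value can be illustrated with a minimal Python sketch. This is not StatAlign's implementation: the function names `structure_entropy` and `consensus_pairs`, the 0.5 threshold, and the assumption that base pairs from every sample have already been mapped onto positions of one reference sequence are all hypothetical choices made for illustration.

```python
import math
from collections import Counter


def structure_entropy(structures):
    """Shannon entropy (in bits) of the empirical distribution of sampled structures.

    `structures` is a list of dot-bracket strings, one per sample (an assumed
    input format, not StatAlign's internal representation).
    """
    counts = Counter(structures)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def consensus_pairs(pair_prob_samples, threshold=0.5):
    """Average base-pair probabilities over alignment samples.

    Each sample is a dict mapping (i, j) reference-sequence positions to a
    pairing probability; a pair missing from a sample counts as probability 0.
    Only pairs whose mean probability reaches `threshold` are returned.
    """
    totals = Counter()
    for sample in pair_prob_samples:
        for pair, prob in sample.items():
            totals[pair] += prob
    n = len(pair_prob_samples)
    return {pair: s / n for pair, s in totals.items() if s / n >= threshold}


if __name__ == "__main__":
    sampled = ["((..))", "((..))", "(....)", "......"]
    print(structure_entropy(sampled))
    samples = [{(1, 10): 0.9, (2, 9): 0.8}, {(1, 10): 0.7, (2, 9): 0.1}]
    print(consensus_pairs(samples))
```

A low entropy means the samples agree on a few structures; a high entropy signals that the prediction depends strongly on which alignment or structure was sampled.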

Related resources

Novel representation of RNA secondary structure used to improve prediction algorithms.

We propose a novel representation of RNA secondary structure for a quick comparison of different structures. Secondary structure was viewed as a set of stems and each stem was represented by two values according to its position. Using this representation, we improved the comparative sequence analysis method results and the minimum free-energy model. In the comparative sequence analysis met...

A folding algorithm for extended RNA secondary structures

MOTIVATION RNA secondary structure contains many non-canonical base pairs of different pair families. Successful prediction of these structural features leads to improved secondary structures with applications in tertiary structure prediction and simultaneous folding and alignment. RESULTS We present a theoretical model capturing both RNA pair families and extended secondary structure motifs ...

Annual Poster Presentation on November 21, 2008

Codon usage bias refers to differences among organisms in the frequency of occurrence of codons in protein-coding DNA sequences. This bias in codon preference has been reported in most genomes that have been studied so far. In some organisms, highly expressed genes have a strong codon preference that is consistent with the concentrations of corresponding tRNAs, whereas genes expressed at a lowe...

DAFS: simultaneous aligning and folding of RNA sequences via dual decomposition

MOTIVATION It is well known that the accuracy of RNA secondary structure prediction from a single sequence is limited, and thus a comparative approach that predicts a common secondary structure from aligned sequences is a better choice if homologous sequences with reliable alignments are available. However, correct secondary structure information is needed to produce reliable alignments of RNA ...

Simultaneous alignment and structure prediction of three RNA sequences

Comparative RNA sequence analyses have contributed remarkably accurate predictions. The recent determination of the 30S and 50S ribosomal subunits brings more supporting evidence. Several inference tools combine free energy minimisation and comparative analysis to improve the quality of secondary structure predictions. This paper investigates the following hypotheses: the use of three i...

Journal:
  • Bioinformatics

Volume 29, Issue 5

Pages: -

Publication date: 2013